SPEECH FUNDAMENTAL PERIOD ESTIMATION USING A TRAINABLE PATTERN CLASSIFIER
SPEECH 88, Edinburgh, 22-26 August 1988
Abstract
This paper describes a pattern recognition algorithm for locating the points of vocal fold closure in a noisy speech signal. The algorithm uses a multi-layer perceptron (MLP) classifier whose inputs come from a window on the speech signal and whose output signals the presence of a vocal fold closure at the centre of the window. The locations of the vocal fold closures, that is, the fundamental period epoch markers, are given the name Tx, and our perceptron algorithm the name MLP-Tx. We describe the MLP-Tx algorithm in relation to other methods for fundamental frequency estimation and compare its performance with another algorithm for Tx estimation from speech. The MLP-Tx algorithm is shown to perform well in noise and to have potential application in a signal-processing hearing aid.
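The windowed classification described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not give the network size, window length, or decision rule, so the single hidden layer, the `window_len` and `threshold` parameters, the peak-picking step, and the hand-set weights in the hypothetical helper `toy_params` are all assumptions. In the paper's setting the weights would come from training rather than being set by hand.

```python
import math


def mlp_forward(window, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer perceptron: window samples in, closure score in (0, 1) out."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, window)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    activation = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-activation))


def mark_epochs(signal, params, window_len=5, threshold=0.5):
    """Slide a window over the signal and score each centre sample.

    Returns (epochs, scores): epochs are sample indices where the score is a
    local maximum at or above the threshold (an assumed decision rule).
    """
    half = window_len // 2
    scores = [0.0] * len(signal)
    for centre in range(half, len(signal) - half):
        window = signal[centre - half:centre + half + 1]
        scores[centre] = mlp_forward(window, *params)
    epochs = [
        n for n in range(1, len(signal) - 1)
        if scores[n] >= threshold
        and scores[n] > scores[n - 1]
        and scores[n] >= scores[n + 1]
    ]
    return epochs, scores


def toy_params(window_len):
    """Hypothetical hand-set weights standing in for a trained network.

    A single hidden unit responds to a sharp negative sample at the window centre.
    """
    weights = [0.0] * window_len
    weights[window_len // 2] = -4.0
    return ([weights], [-2.0], [4.0], 0.0)


def periods_from_epochs(epochs):
    """Fundamental periods in samples: spacing between successive epoch markers."""
    return [later - earlier for earlier, later in zip(epochs, epochs[1:])]
```

With a toy signal that has a negative pulse every 10 samples, `mark_epochs` places one marker on each pulse and `periods_from_epochs` returns the spacing between markers in samples. Dividing the sampling rate by that spacing would give a fundamental frequency estimate.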
Related papers
Layered neural nets for pattern recognition - Acoustics, Speech and Signal Processing [see also IEEE Transactions on Signal Processing], IEEE Tr
Adaptive threshold logic elements called ADALINES can be used in trainable pattern recognition systems. Adaptation by the LMS (least mean squares) algorithm is discussed. Threshold logic elements only realize linearly separable functions. To implement more elaborate classification functions, multilayered ADALINE networks can be used. A pattern recognition concept involving first an “invariance ...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Cipher text only attack on speech time scrambling systems using correction of audio spectrogram
Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...
A Trainable Speech Enhanceme Mixture Models for Spe
Our work introduces a trainable speech enhancement technique that can directly incorporate information about the long-term, time-frequency characteristics of speech signals prior to the enhancement process. We approximate noise spectral magnitude from available recordings from the operational environment as well as clean speech from a clean database with mixtures of Gaussian pdfs using the Expe...
Noise perturbation for supervised speech separation
Speech separation can be treated as a mask estimation problem, where interference-dominant portions are masked in a time-frequency representation of noisy speech. In supervised speech separation, a classifier is typically trained on a mixture set of speech and noise. It is important to efficiently utilize limited training data to make the classifier generalize well. When target speech is severe...